Simultaneous Query Pipelines in QPipe
Abstract
Data warehousing and scientific database applications operate on massive datasets and are characterized by complex queries that access large portions of the database. Concurrent queries often exhibit high data and computation overlap: they access the same relations on disk, compute similar aggregates, or share intermediate results. Unfortunately, run-time sharing in modern database engines is limited by the paradigm of invoking an independent set of operator instances per query, so sharing opportunities can be missed if the buffer pool evicts data early.
QPipe is a new, operator-centric relational query engine that can detect and exploit overlap across concurrent queries at run time [1]. In QPipe, each relational operator is promoted to an independent micro-engine (μEngine) with its own resource management and runtime support. Each incoming query is broken into as many tasks (query packets) as there are nodes in its compiled query plan tree, and each packet is queued in front of the corresponding μEngine. Since query packets are self-contained requests, QPipe allows external applications to submit custom query packets, bypassing the regular parsing and optimization phases.
Under regular query execution, μEngines work independently and evaluate each query in parallel. Data flows between the μEngines of a query through dedicated tuple buffers. Each μEngine continuously monitors its queue to detect opportunities for sharing data and work across different queries. Once such an opportunity is detected, only one query packet remains active and performs the overlapping operation, while its results are simultaneously pipelined to all consuming queries. More details about sharing opportunities and the implementation of simultaneous pipelining can be found elsewhere [1]. This system demonstration exposes the key novel features of QPipe and provides an intuitive way of visualizing query execution inside the database engine.
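The sharing step described above can be illustrated with a minimal, single-threaded Python sketch. It is not QPipe's implementation: the names `QueryPacket`, `MicroEngine`, `signature`, and `produce` are hypothetical, overlap is detected only among packets already waiting in the queue, and a plain list stands in for each tuple buffer.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class QueryPacket:
    """One unit of work for a single operator (hypothetical layout)."""
    query_id: str
    signature: tuple                             # e.g. ("scan", "lineitem")
    output: list = field(default_factory=list)   # stands in for a tuple buffer


class MicroEngine:
    """Toy stand-in for one operator's micro-engine (assumption, not QPipe code)."""

    def __init__(self, produce):
        self.produce = produce   # function: signature -> iterable of tuples
        self.queue = deque()     # packets waiting in front of this engine

    def submit(self, packet):
        self.queue.append(packet)

    def run(self):
        """Drain the queue, evaluating each distinct signature only once."""
        evaluations = 0
        while self.queue:
            host = self.queue.popleft()
            # Overlap detection: queued packets that ask for the same work.
            satellites = [p for p in self.queue if p.signature == host.signature]
            self.queue = deque(p for p in self.queue if p.signature != host.signature)
            consumers = [host] + satellites
            # Only the host's operation runs; each tuple goes to every consumer.
            for row in self.produce(host.signature):
                for packet in consumers:
                    packet.output.append(row)
            evaluations += 1
        return evaluations


# Illustrative data: two queries scan the same table, a third scans another.
tables = {"lineitem": [(1, 10.0), (2, 12.5), (3, 7.0)], "orders": [(1, "open")]}
scan_engine = MicroEngine(lambda sig: tables[sig[1]])

q1 = QueryPacket("Q1", ("scan", "lineitem"))
q2 = QueryPacket("Q2", ("scan", "lineitem"))
q3 = QueryPacket("Q3", ("scan", "orders"))
for packet in (q1, q2, q3):
    scan_engine.submit(packet)

runs = scan_engine.run()
```

In this toy run the `lineitem` scan is evaluated once and its rows reach both `Q1` and `Q2`, while `Q3` is evaluated separately. A real engine would also have to handle packets that arrive while the overlapping operation is already in progress, which this sketch leaves out.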
Unlike modern commercial engines, which are typically demonstrated as “black boxes” because of the tight integration of system components and query processing algorithms, QPipe naturally breaks up query execution into stages, allowing for a visually appealing demonstration. We organize the demo storyline into three parts:
• Resource utilization and query progress. For a single query, we show which μEngines are working in parallel, the progress of the query, and the tuple flow between relational operators. For multiple concurrent queries, we show queue status and thread assignment at each μEngine, along with opportunities for sharing data or work across queries.
• Simultaneously pipelined query execution. By switching QPipe to simultaneous pipelining mode, we demonstrate the entire procedure of sharing overlapping table scans or intermediate result computations across different queries. All possible run-time actions of QPipe are demonstrated through pre-designed scenarios.
• Ad-hoc query pipelines. QPipe allows external applications to submit ad-hoc query packets. The user can construct and submit custom packets to QPipe through a graphical user interface and observe how they are evaluated.
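The packet model behind the ad-hoc pipelines part can be sketched in Python as follows. This is a sketch under stated assumptions, not QPipe's interface: `PlanNode`, `Packet`, `make_packets`, and `dispatch` are hypothetical names, and a dictionary of queues stands in for the per-operator μEngine queues. It shows one packet being emitted per plan node, and a hand-built packet entering the same queues without passing through a parser or optimizer.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field
from itertools import count


@dataclass
class PlanNode:
    """Node of a compiled query plan tree (hypothetical layout)."""
    operator: str                 # e.g. "scan", "join", "aggregate"
    args: tuple = ()
    children: list = field(default_factory=list)


@dataclass
class Packet:
    """Self-contained request for one operator (hypothetical layout)."""
    packet_id: int
    query_id: str
    operator: str
    args: tuple
    input_ids: list               # packets whose output buffers feed this one


def make_packets(query_id, root, ids):
    """Emit one packet per plan node, children before parents."""
    packets = []

    def visit(node):
        inputs = [visit(child) for child in node.children]
        packet = Packet(next(ids), query_id, node.operator, node.args, inputs)
        packets.append(packet)
        return packet.packet_id

    visit(root)
    return packets


def dispatch(packets, queues):
    """Append each packet to the queue of the engine for its operator."""
    for packet in packets:
        queues[packet.operator].append(packet)


ids = count(1)
queues = defaultdict(deque)       # one queue per operator type

# Regular path: a compiled plan is split into as many packets as it has nodes.
plan = PlanNode("aggregate", ("sum", "price"), [
    PlanNode("join", ("orders.id", "lineitem.order_id"), [
        PlanNode("scan", ("orders",)),
        PlanNode("scan", ("lineitem",)),
    ]),
])
dispatch(make_packets("Q1", plan, ids), queues)

# Ad-hoc path: an external application builds a packet by hand.
dispatch([Packet(next(ids), "adhoc-1", "scan", ("lineitem",), [])], queues)
```

Keeping the packet self-contained (operator, arguments, and input references in one record) is what lets the hand-built packet share the scan queue with packets produced from a compiled plan.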